Find Duplicates
John E. Bean, P.E.
Version 5.00
May 1991
Copyright 1988, 1991 - JB Technology Inc.
INTRODUCTION
GETTING STARTED
SEARCH METHODS
USING FIND DUPLICATES
PROGRAM FILES
THE JBT TOOLS
Find Duplicates is a utility program for IBM PCs equipped with
hard disk drives. Find Duplicates will search a specified
drive(s) and find all duplicate files.
Once the duplicate files have been found, they are displayed on
the screen. Files can be tagged for deletion, viewed, printed or
renamed.
Find Duplicates was written in Turbo Pascal 5.0, utilizing Turbo
5.0 Professional by Turbo Power Software.
The program requires 256K of RAM and a hard disk.
To run Find Duplicates in its simplest form, type FD at the Dos
prompt.
Find Duplicates has twelve (12) "Command Line" parameters.
"Command Line" parameters are options which can be typed
immediately after FD. Entering command line parameters will
preload the Dialog Box. The "Command Line" parameters are:
----------- ----------------------------------------------
/D=<drives> This command line parameter is used when you
wish to use either multiple drives or find du-
plicate files on a drive other than the one
you are currently logged on.
Example 1: FD /D=D
In this example, Find Duplicates will scan
DRIVE D for duplicate files.
Example 2: FD /D=D,c
In this example, Find Duplicates will scan
both DRIVE D & C for duplicate files. If
a file on drive C is duplicated on drive
D, Find Duplicates will find it.
Example 3: FD /D=ALL
In this example, Find Duplicates will scan
ALL the disk drives on your computer, be-
ginning with drive A.
/BW This command line option will tell Find Dupli-
cates to display everything on the screen in
black and white. Find Duplicates can tell if
your computer has a color graphics card, but
it can not tell if your computer has a color
monitor or not. This option should be used if
you are using a laptop computer.
/RO This command line parameter should be used
with care! The /RO command line parameter will
include READ ONLY files when Find Duplicates
searches the specified drives.
/A This command line parameter will include files
that are in ARC files.
/Z This command line parameter will include files
that are in ZIP files.
/VW=<File> Find Duplicates comes with a program entitled,
VIEW.COM. This program is used when you want
to view a duplicate file. There are many file
viewing programs that can be used. If you have
a file viewing program you would rather use,
specify it with this command line option.
Example 1: FD /VW=C:\LIST.COM
In this example, the file viewing
program LIST.COM is located on the
root directory of drive C. When you
depress the letter V to view a file,
the program LIST.COM will be used
instead of VIEW.COM.
/PARTIAL This command line parameter will scan a
partial list of files.
/SLOW This command line parameter will start in the
"SLOW" scanning method.
/SLOWER This command line parameter will start in the
"SLOWER" searching method.
/SLOWEST This command line parameter will start in the
"SLOWEST" searching method.
/PASS=<#> This command line parameter is used only with
the SLOW, SLOWER and SLOWEST searching method.
Each of the above methods has several plans. (See
/SD=<Drive> This command line will tell Find Duplicates
which drive to store temporary files when
using either the SLOW, SLOWER or SLOWEST
searching methods. (If this command line
parameter is not used, Find Duplicates will
store the temporary files on the drive with
the most free space.)
The following are valid examples illustrating the use of command
line parameters.
Example 1:
You are logged onto drive C and want to find all the
duplicate files on drive C.
C:>FD <Enter>
Example 2:
You are logged onto drive C and want to find all
duplicate files on drive D and want the output to be
in black and white.
C:>FD /D=D /BW <Enter>
Example 3:
You are logged onto drive C and want to find all
duplicate files on drives C and D, with the output in
black and white, using the SLOW method of scanning.
C:>FD /D=C,D /BW /SLOW <Enter>
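The way these parameters preload the Dialog Box can be pictured with a
short sketch. The Python below is only an illustration and is not taken
from FD.EXE: the function name, the settings keys and the default values
are assumptions, and mapping /PARTIAL to the "Fast Partial" method is a
guess based on the Dialog Box labels.

```python
def parse_fd_args(args):
    """Turn FD-style switches into a settings dictionary (illustrative only)."""
    settings = {
        "drives": [],               # empty list stands for the current drive
        "black_and_white": False,   # /BW
        "read_only": False,         # /RO
        "arc": False,               # /A
        "zip": False,               # /Z
        "viewer": "VIEW.COM",       # /VW=<File>
        "method": "FAST FULL",      # assumed default search method
        "start_pass": 1,            # /PASS=<#>
        "scratch_drive": None,      # /SD=<Drive>
    }
    flags = {"/BW": "black_and_white", "/RO": "read_only",
             "/A": "arc", "/Z": "zip"}
    methods = {"/PARTIAL": "FAST PARTIAL", "/SLOW": "SLOW",
               "/SLOWER": "SLOWER", "/SLOWEST": "SLOWEST"}

    for arg in args:
        name, _, value = arg.partition("=")
        name = name.upper()
        if name in flags:
            settings[flags[name]] = True
        elif name in methods:
            settings["method"] = methods[name]
        elif name == "/D":
            # "D,c" becomes ["D", "C"]; "ALL" is kept as a single entry
            settings["drives"] = [d.strip().upper()
                                  for d in value.split(",") if d.strip()]
        elif name == "/VW":
            settings["viewer"] = value
        elif name == "/PASS":
            settings["start_pass"] = int(value)
        elif name == "/SD":
            settings["scratch_drive"] = value.upper()
        else:
            raise ValueError(f"unknown parameter: {arg}")
    return settings


# Example 3 above: FD /D=C,D /BW /SLOW
print(parse_fd_args(["/D=C,D", "/BW", "/SLOW"]))
```

Unknown switches raise an error here; how the real program reacts to a
mistyped parameter is not stated in this manual.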
Find Duplicates can locate duplicate files in two (2) different
ways. The following is a description of the two ways of finding
duplicate files.
Programs are loaded into the memory that is in the
first 640K of RAM. Most programs do not use all of this
memory. In the programming language that I write in,
the memory remaining is called "HEAP".
The HEAP method of finding duplicates stores all the
information that is required in available HEAP. This
includes a list of all the drives/subdirectories, list
of all ZIP/ARC files, all file names and the duplicate
file information.
Depending on how many files you have, and the amount of
the HEAP available, this method may or may not be
usable.
I have figured out roughly how many files Find
Duplicates can scan. First, run CHKDSK to see how much
free RAM space is available. The following formula will
give you a rough estimate:
 Free Space - 137160
--------------------- = Number of Files
         42

Running CHKDSK indicates 585088 free bytes.

   585088 - 137160
--------------------- = 10,665 Files
         42
If the drives you are going to scan contain more than
the calculated number of files, then you will be unable
to use the FAST - FULL method of scanning.
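The estimate above can be worked as a small calculation. This is a sketch
only: the constant 137160 comes from the formula, while the divisor of 42
bytes per file is an assumption inferred from the worked example
(585088 - 137160 = 447928, and 447928 / 42 rounds to 10,665). The
function and constant names are made up for illustration.

```python
OVERHEAD_BYTES = 137160   # fixed amount subtracted in the manual's formula
BYTES_PER_FILE = 42       # assumption: inferred from the worked example


def estimate_max_files(free_bytes):
    """Rough number of files the HEAP method can hold for a given free RAM."""
    usable = max(free_bytes - OVERHEAD_BYTES, 0)
    return round(usable / BYTES_PER_FILE)


# CHKDSK reports 585088 free bytes in the example above.
print(estimate_max_files(585088))   # 10665
```

If the drives to be scanned hold more files than this estimate, one of the
disk based methods is the safer choice.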
The disk method stores all the data, except the
duplicate file list, in temporary disk files. Find
Duplicates will automatically scan all your drives to
determine which drive has the most free space, and will
store these files there. You can use the /SD=<Drive> command
line parameter to specify the drive to use. (If you have a
RAM disk, you may wish to use it for speed.)
This method is slower than the HEAP method, but if you
have a large number of files, you will have to use this
method.
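The choice of drive for the temporary files can be sketched as follows.
This is not the program's own code: the function names are hypothetical,
and the free space figures in the example are invented. The helper
free_space() uses Python's shutil.disk_usage to read real values.

```python
import shutil


def pick_scratch_drive(free_by_drive, requested=None):
    """Return the drive for temporary files.

    free_by_drive maps a drive name to its free bytes. If a drive was
    requested (the /SD=<Drive> case), it wins; otherwise the drive with
    the most free space is used.
    """
    if requested is not None:
        return requested.upper()
    return max(free_by_drive, key=free_by_drive.get)


def free_space(paths):
    """Measure free bytes for each path (for example 'C:\\' on a PC)."""
    return {path: shutil.disk_usage(path).free for path in paths}


# Invented figures, for illustration only.
print(pick_scratch_drive({"C": 1_200_000, "D": 5_400_000, "E": 800_000}))
```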
The Find Duplicates Dialog Box allows five (5) SEARCHING METHODS. Each
method is described below.
This method will search all the specified drive(s) for all duplicate
files using the HEAP scanning method as described. If you run out of
memory, the program will display an error message indicating the
number of files read up to that point.
You can then run Find Duplicates again, using one of the following
searching methods.
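The idea behind an in-memory scan can be sketched in a few lines. This is
a minimal illustration, assuming that two files count as duplicates when
their names match regardless of directory; it is not how FD.EXE is
written (the program is in Turbo Pascal), and the function name is
hypothetical.

```python
import os
from collections import defaultdict


def find_duplicate_names(roots):
    """Group every file under the given roots by upper-cased file name."""
    seen = defaultdict(list)
    for root in roots:
        for folder, _subdirs, files in os.walk(root):
            for name in files:
                seen[name.upper()].append(os.path.join(folder, name))
    # Keep only the names that occur more than once.
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

Everything is held in one dictionary, which is why a scan of this kind is
fast but limited by the memory available.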
This method will search all the specified drive(s) for duplicate files
within the range that you specify (e.g. A thru D) using the HEAP
scanning method.
If you run out of memory, the program will display an error message
indicating the number of files read up to that point.
You can then run Find Duplicates again, using one of the other
searching methods described below.
This is the first of the three (3) searching methods that uses the
DISK scanning method. Each of these methods contains a plan with
multiple passes.
Find Duplicates will search for all the duplicate files whose first
character is within the characters specified in that pass. If any
duplicate files are found within that pass, they are displayed as
described in the section USING FIND DUPLICATES. When all appropriate
actions are taken, the next pass is performed. This process is repeated
until all the passes are made, or Find Duplicates is aborted by
depressing ESC while reading files from the disk or sorting the
temporary disk file.
For the SLOW searching method, the following are the passes and the
characters against which the first character of each file is compared.
Pass Characters
---- ---------------
1 A thru M
2 N thru Z
3 0 thru 9
4 ! thru /
5 : thru @
6 [ thru ~
The three DISK methods can start at a pass other than pass number one
(1) by using the /PASS=# command line parameter as described in the
section GETTING STARTED.
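To see which SLOW pass a given file falls into, the pass table can be
written as a list of character ranges. This is a sketch only: the names
are hypothetical, and upper-casing the first character is an assumption
based on DOS storing file names in capitals.

```python
# The six SLOW passes, as (first, last) characters in ASCII order.
SLOW_PASSES = [("A", "M"), ("N", "Z"), ("0", "9"),
               ("!", "/"), (":", "@"), ("[", "~")]


def slow_pass_for(file_name):
    """Return the SLOW pass number (1 to 6) that covers this file name."""
    first = file_name[0].upper()
    for number, (low, high) in enumerate(SLOW_PASSES, start=1):
        if low <= first <= high:
            return number
    return None   # first character lies outside ASCII 33 through 126


print(slow_pass_for("README.TXT"))   # 2, so FD /SLOW /PASS=2 starts there
```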
This is the second searching method which uses the DISK scanning
method.
Find Duplicates will search for all the duplicate files whose first
character is within the characters specified in that pass. If any
duplicate files are found within that pass, they are displayed as
described in the section USING FIND DUPLICATES. When all appropriate
actions are taken, the next pass is performed. This continues until all
passes are made, or Find Duplicates is aborted by depressing ESC while
searching for duplicate files.
For the SLOWER searching method, the following are the passes and the
characters against which the first character of each file is compared.
Pass Characters
---- ---------------
1 A thru D
2 E thru H
3 I thru L
4 M thru P
5 Q thru T
6 U thru Z
7 0 thru 4
8 5 thru 9
9 ! thru /
10 : thru @
11 [ thru ~
The SLOWER scanning method can start at a pass other than pass number
one (1) by using the /PASS=# command line parameter as described in
the section GETTING STARTED.
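The pass-by-pass process can be sketched as a loop over the plan. This is
an illustration under several assumptions: the function and variable
names are made up, the file list is held in memory rather than read from
the disk and sorted in a temporary file on each pass, and pass 9 is taken
to be "! thru /" so that it does not overlap pass 10, mirroring the SLOW
table.

```python
from collections import defaultdict

# The eleven SLOWER passes as (first, last) characters.
SLOWER_PASSES = [("A", "D"), ("E", "H"), ("I", "L"), ("M", "P"),
                 ("Q", "T"), ("U", "Z"), ("0", "4"), ("5", "9"),
                 ("!", "/"), (":", "@"), ("[", "~")]


def run_passes(paths, plan, start_pass=1):
    """Yield (pass_number, duplicates) for each pass from start_pass on.

    paths are DOS-style strings such as 'C:\\UTIL\\LIST.COM'. Only names
    whose first character falls in the current pass are looked at, so
    each pass handles a fraction of the files.
    """
    for number, (low, high) in enumerate(plan, start=1):
        if number < start_pass:
            continue   # the /PASS=# case: skip the earlier passes
        groups = defaultdict(list)
        for path in paths:
            name = path.rsplit("\\", 1)[-1].upper()
            if name and low <= name[0] <= high:
                groups[name].append(path)
        yield number, {n: p for n, p in groups.items() if len(p) > 1}


files = ["C:\\UTIL\\LIST.COM", "D:\\TOOLS\\LIST.COM", "C:\\DOS\\FORMAT.COM"]
for number, duplicates in run_passes(files, SLOWER_PASSES, start_pass=3):
    if duplicates:
        print(number, duplicates)
```

Because each pass only keeps the names for one range of characters, the
amount of data handled at once stays small, at the cost of reading the
drives once per pass.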
This is the third and slowest of the DISK methods. Each pass consists of
a single character. The first pass is ASCII character 33 (!) and
the last pass is ASCII character 126 (~). That means this
searching method will search the disk drive ninety-four (94) times.
For the SLOWEST searching method, the following are the passes and
the character against which the first character of each file is compared.
---- ---- ---- ---- ---- ---- ---- ----
1 - ! 25 - 9 49 - Q 73 - i
2 - " 26 - : 50 - R 74 - j
3 - # 27 - ; 51 - S 75 - k
4 - $ 28 - < 52 - T 76 - l
5 - % 29 - = 53 - U 77 - m
6 - & 30 - > 54 - V 78 - n
7 - ' 31 - ? 55 - W 79 - o
8 - ( 32 - @ 56 - X 80 - p
9 - ) 33 - A 57 - Y 81 - q
10 - * 34 - B 58 - Z 82 - r
11 - + 35 - C 59 - [ 83 - s
12 - , 36 - D 60 - \ 84 - t
13 - - 37 - E 61 - ] 85 - u
14 - . 38 - F 62 - ^ 86 - v
15 - / 39 - G 63 - _ 87 - w
16 - 0 40 - H 64 - ` 88 - x
17 - 1 41 - I 65 - a 89 - y
18 - 2 42 - J 66 - b 90 - z
19 - 3 43 - K 67 - c 91 - {
20 - 4 44 - L 68 - d 92 - |
21 - 5 45 - M 69 - e 93 - }
22 - 6 46 - N 70 - f 94 - ~
23 - 7 47 - O 71 - g
24 - 8 48 - P 72 - h
When writing Find Duplicates, I assumed that anybody using this method
will use the /PASS=# command line parameter to specify where to begin
the search.
For example, if you want to begin looking for duplicate files starting
with the letter C (pass 35 in the table above), then you would enter
the following:
C:>FD /SLOWEST /PASS=35 <Enter>
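Since the SLOWEST passes follow the ASCII sequence from 33 (!) to 126 (~),
the pass number for any starting character can be worked out without the
table by subtracting 32 from its ASCII code. The function name below is
hypothetical and the snippet is only a convenience sketch.

```python
def slowest_pass_for(character):
    """Return the SLOWEST pass number (1 to 94) for a single character."""
    code = ord(character)
    if not 33 <= code <= 126:
        raise ValueError("SLOWEST passes cover ASCII 33 (!) through 126 (~)")
    return code - 32


print(slowest_pass_for("C"))   # 35
```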
Once Find Duplicates has been invoked, as described in "GETTING
STARTED", a Dialog Box will appear. The following is the Find
Duplicates Dialog Box.
+-------------- Find Duplicates --------------+
|                                             |
|  Drives to Scan          Search Method      |
|  D                       (∙) Fast Full      |
|  Search Options          ( ) Fast Partial   |
|  [ ] Include Read Only   ( ) Slow           |
|  [ ] Look Inside ARCs    ( ) Slower         |
|  [ ] Look Inside ZIPs    ( ) Slowest        |
|                                             |
|                     OK   ▄                  |
|                   ▀▀▀▀▀▀▀▀                  |
+---------------------------------------------+
The Dialog Box has four (4) fields. To traverse between the
fields, depress the TAB key to go forward and SHIFT TAB
combination to go backwards. Depressing the ESC key in any of the
fields will abort the program.
The following will list the four fields and describe what their
functions are and how to enter data.
This field is where you enter which drives you want Find
Duplicates to search. Each drive letter should be separated
by a comma.
If you want Find Duplicates to search all the drives on your system,
enter ALL.
Entering the drives utilizes a line editor which has many
features that allow flexibility when entering the drive(s).
The editing commands are:
-------- ------------------------------------------
<Enter> Accept Line
Esc Quit without changing line
Left Arrow Cursor left one character
Right Arrow Cursor right one character
^Left Arrow Cursor left one word
^Right Arrow Cursor right one word
Home Cursor to beginning of line
End Cursor to end of line
Del Delete character under cursor
Backspace Delete character left of cursor
^End Delete to end of line
^Y Delete entire line
^Home Delete from beginning of line
^T Delete word to right of cursor
Ins Toggle insert mode
^R Restore original contents of line
Some valid examples of DRIVES TO SCAN are:
Once the drives have been entered, depress the TAB key to go
to the SEARCH OPTIONS field or depress SHIFT TAB key
combination to go to the OK field.
The SEARCH OPTIONS field is a CHECK BOX field. The Find
Duplicates CHECK BOX field contains three options.
Depressing the SPACE BAR will "toggle" the state of the
option. If an "X" appears within the "[ ]", then the option
is "toggled" ON.
The CHECK BOX options are:
Read Only files are files that can be read from the disk
but cannot be written to or deleted. If this option is
"toggled" ON, then Find Duplicates will list a read only
file as a duplicate file if applicable. If listed as a
duplicate file, and marked for deletion, Find Duplicates
WILL DELETE a read only file.
ARC files are files with the extension *.ARC. An ARC file
can have numerous files compressed in a single ARC file.
If "toggled" ON, Find Duplicates will include the files
within an ARC file as part of the scanning process.
If a file within an ARC is marked for deletion, and Find
Duplicates knows where the programs ARC or PKPAK are
located, the files within the ARC file will be deleted.
ZIP files are files with the extension *.ZIP. A ZIP file
can have numerous files compressed in a single ZIP file.
If "toggled" ON, Find Duplicates will include the files
within a ZIP file as part of the scanning process.
If a file within a ZIP is marked for deletion, and Find
Duplicates knows where the program PKZIP is located, the
files within the ZIP file will be deleted.
Use the ARROW KEYS and the HOME and END KEYS to move between
options. Once you have "toggled" the options to your
satisfaction, depress TAB to go to the SEARCH METHOD field
or depress the SHIFT TAB combination to go to the DRIVES TO
SCAN field.
The SEARCH METHOD field allows ONLY ONE option to be selected. A "∙" is
displayed in the option that is "toggled".
The SEARCH METHODS were discussed in detail in the section
SEARCH METHODS.
Use the ARROW KEYS and the HOME and END KEYS to move between
options. As you move between options, the "∙" will move to
that option. Once you have moved the "∙" to the desired search
method, depress TAB to go to the OK field or depress the
SHIFT TAB combination to go to the SEARCH OPTIONS field.
The OK field is a verification field. There are only two keys
that can be pressed. SHIFT will inform Find Duplicates that
the fields have been correctly completed and to begin finding
duplicate files. The SHIFT TAB combination will go to the SEARCH
METHOD field.
Once the Dialog Box has been completed, Find Duplicates will
begin scanning the specified drive(s) searching for duplicate
files.
Depending on the method of scanning, heap or disk, informational
screens will appear to show you what is going on.
After finding the duplicate files, a display screen will appear.
On the screen the duplicate files are displayed, separated by a
blank line. The following are the keys and the actions they perform:
---------- ----------------------------------------
Down Arrow Move highlight bar down one file.
Up Arrow Move highlight bar up one file.
PgDn Move highlight bar down 18 files.
PgUp Move highlight bar up 18 files.
Home Move highlight bar to first file.
End Move highlight bar to last file.
Gray Plus,
T Tag highlighted file for deletion.
Gray Minus,
U Untag highlighted file.
SpaceBar Change the tag state of the highlighted
file. If the highlighted file is
"tagged" then the file will be
"untagged". If the highlighted file is
"untagged" then the file will be
"tagged".
ALT U Untags all the files.
S Shows the status. The status includes
the number of duplicate files found and
their total file size, and the number of
tagged files found and their total file
size.
R Rename the current file.
P Print duplicate files to printer.
V View highlighted file using VIEW.COM
ALT O Temporarily suspend Find Duplicates,
exit to DOS in the drive/subdirectory
where the highlighted file is located.
F1 Display Help Screen.
F10 Erase tagged files.
ESC Exit Find Duplicates.
Once all the desired duplicate files have been tagged, depress
function key F10. You will be asked if you are sure you want to
erase the tagged files. If you depress "Y" the tagged files will
be erased, a screen will appear listing the number of files that
were erased and the amount of disk space recovered, and the
program will be terminated.
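The tagging keys and the status display can be modelled with a small
class. This is purely an illustration of the behaviour described above,
with hypothetical names and invented file sizes; it keeps track of tags
only and does not erase anything.

```python
class DuplicateList:
    """Track which duplicate files are tagged for deletion."""

    def __init__(self, files):
        # files is a list of (path, size_in_bytes) pairs
        self.files = list(files)
        self.tagged = set()          # indexes of the tagged files

    def tag(self, index):            # Gray Plus or T
        self.tagged.add(index)

    def untag(self, index):          # Gray Minus or U
        self.tagged.discard(index)

    def toggle(self, index):         # SpaceBar
        if index in self.tagged:
            self.untag(index)
        else:
            self.tag(index)

    def untag_all(self):             # ALT U
        self.tagged.clear()

    def status(self):                # S
        return {
            "duplicates": len(self.files),
            "duplicate_bytes": sum(size for _path, size in self.files),
            "tagged": len(self.tagged),
            "tagged_bytes": sum(self.files[i][1] for i in self.tagged),
        }


listing = DuplicateList([("C:\\UTIL\\LIST.COM", 8192),
                         ("D:\\TOOLS\\LIST.COM", 8192)])
listing.toggle(1)
print(listing.status())
```

The "tagged_bytes" figure corresponds to the disk space that would be
reported as recovered once the tagged files are erased with F10.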
That is all there is to Find Duplicates. It was designed to be
easy but powerful. I hope you enjoy this program and put it to
good use.
If this program is not part of the JBT TOOLS then the following
files should be included on the disk.
-------- -----------------------------------------------
FD.EXE The Find Duplicates program.
FD.DOC This file.
VIEW.COM A file viewing program used with Find
Duplicates.
INVOICE.DOC How to order Find Duplicates.
The JBT ToolBox
The program you are using is one of several programs that I have
written that are either public domain or shareware.
These programs have appeared on bulletin boards, distributed by
ShareWare clearing houses and are included in several of the
Dvorak-Osborne series of books.
Most users of these programs are not aware that the other
programs that I have written exist.
I have created The JBT Tools to provide an interface that allows
all my programs to be used.
The following are the programs included in the JBT Tools and a
brief description of each program.
EzDoss is a Dos Shell. As with all Dos shells, actions
can be taken on multiple files. I have used many Dos
shells in the past and they have many features that are
great. Unfortunately, these features are not all
included in one shell. Therefore, I wrote EzDoss to
include the features I like the best. Some of the
features are:
When copying or moving files and there is a
file with the same name on the destination,
most Dos Shells offer only "Overwrite".
EzDoss provides several options. They are:
This option is typical. It will
overwrite the file on the
destination with the file on the
source.
This option will copy/move the file
to the destination only if the file
on the destination is OLDER than
the file on the source.
This option will rename the file on
the destination and copy the file
from the source.
This option will rename the source
file, then copy the file to the
destination.
These options provide the most flexible
options when copying or moving files.
Tagging Files:
Multiple files can be "tagged". Besides
tagging individual files, all files can be
tagged, files can be tagged by specifying a
"pattern" or by selecting a date, in which case
files newer, older, the same or a combination
will be tagged.
Help System:
EzDoss's manual is online. Help for the
current function can be displayed by
depressing function key F1. Depressing Alt F1
will display the list of help "topics". There
are over fifty (50) help "topics".
Zip Management:
Many public domain/shareware users access
bulletin boards. Files downloaded from these
boards are compressed into a "ZIP" file.
EzDoss allows files to be "zipped", "zipped"
files to be "unzipped" and files within a "zip"
to be displayed. Those files displayed within
a "zip" can be "tagged" and "unzipped",
viewed on the screen or deleted from the zip.
Disk Sort:
When files are displayed, they can be sorted
eight (8) different ways. This does not sort
the files on the disk.
EzDoss DOES provide a public domain program
that can be called from within EzDoss that
will sort the disk.
EzDoss comes with EzFind. EzFind will search
any non-executable files for a string. This
can be used to find a key word within
word processing files that are not Ascii.
These are just a few of the many features that EzDoss
has. I use this program every day, and I am sure you
will find it as useful.
EzEdit is a text editor. The file to be edited/created
is displayed in a window. EzEdit can have multiple
windows opened.
A block of text can be copied to the "clipboard" where
the text can be copied/moved into other files.
Windows are moveable and resizeable. A pop up calculator
and Ascii chart also come with the program.
EzView will display an Ascii file in a window. EzView
can have multiple windows opened.
Windows are moveable and resizeable.
EzLocate is a file finder program. Multiple drives can
be scanned for files which meet the user specified
masks. Multiple masks can be specified.
EzLocate can also look inside ARC and ZIP files for the
user specified masks.
Find Duplicates:
Find Duplicates is a program that can scan multiple
disk drives and find all the files with duplicate file
names. Once the drive(s) have been scanned, the
duplicate files are displayed on the screen. Each file
can be viewed (ASCII only), renamed or "tagged" for
deletion, and you can suspend to DOS in that file's
subdirectory.
Find Duplicates can list duplicate files that are
stored in ARC and ZIP files. If your PC has PKZIP and
ARC or PKPAK, those duplicate files can be deleted
within the ZIP or ARC files.
Back Off!:
Back Off! is a utility that you will use on a weekly
basis. Back Off! is a multiple file extension deletion
program. It will search any and all drives for files
ending with extensions matching a previously prepared
list (e.g. .BAK, .CHK, .$$$).
Back Off! will show you the matching filenames and
their locations, and give you the option of viewing
them and/or "tagging" and deleting them.
When I first used a MsDos PC, I wrote small batch files
using the COPY CON command from the Dos prompt. If I had a
dime for every time I goofed and had to start over
again, I would be a rich man. I wrote CopyCon to help
me write small batch files.
CopyCon has grown since version 1.00 and now not only
allows simple batch files to be created/edited, but
also allows graphic boxes and Ascii characters to be
inserted and lines to be centered.
One of the most important features of CopyCon is the on
line help. Not only does the help provide information
on how CopyCon works, but it is a reference on batch
file commands including examples.
Also included are two small programs that can be used
in batch files and two useful batch files. They are
STATUS.BAT which will read your PC and list out its
status. The other batch file is MENU.BAT which is a
menuing system that is all set up.
You will find that these programs will increase your
productivity and make using your PC easier.
The following page is an INVOICE that you can use to order the
JBT Tools:
Remit to: From:
JB Technology Inc. __________________________
28701 N. Main St.
Ridgefield, Wa. 98642 __________________________
(206) 887-3442
Contact Individual
Quantity Unit Price Total
_______ JBT Tools Software $35.00 ___________
_______ JBT Tools Manual $20.00 ___________
(Laser Quality)
Total ___________
I want 5 1/4" _______ 3 1/2" _______ diskette. (Check One)
Note that Back Off! computer software has been delivered and ac-
cepted by the customer. Upon receipt of paid invoice, a current
disk will be sent.
After the July 4th, 1991 weekend, I will have a fax machine to
take orders and comments. I am using a phone switch box so the
phone number at the top of the page is also the FAX number.
After the first ring, press the SEND on your FAX. This will allow
the switch box to switch to my Fax machine.
At that time I will be set up to accept VISA and MASTER CARD. If
you want to pay by either, fill out the following:
Card Number: ______ ______ ______ _______ Exp. Date ____/____
Type of Card: Visa _____ Master ______ (Check One)